Propagating Both Trust and Distrust with Target Differentiation for Combating Web Spam
Abstract
Propagating trust or distrust from a set of seed (good or bad) pages to the entire Web has been widely used to combat Web spam. It has been observed that combining good and bad seeds can lead to better results, but little known work has realized this insight successfully. A serious issue with existing algorithms is that trust or distrust is propagated in non-differential ways, and differential propagation seems impossible to implement if only trust or only distrust is propagated. In this paper, we take the view that each Web page has both a trustworthy side and an untrustworthy side, and we assign two scores to each page: T-Rank, which scores its trustworthiness, and D-Rank, which scores its untrustworthiness. We then propose an integrated framework that propagates both trust and distrust. In the framework, the propagation of T-Rank is penalized by the target's current D-Rank, and the propagation of D-Rank is penalized by the target's current T-Rank. In this way, propagation of both trust and distrust with target differentiation is implemented. The proposed Trust-Distrust Rank (TDR) algorithm not only makes full use of both good seeds and bad seeds, but also overcomes the disadvantages of existing trust propagation and distrust propagation algorithms. Experimental results show that TDR outperforms other typical anti-spam algorithms under various criteria.
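The abstract does not give the update equations, so the following Python is only a rough sketch of the idea of target-differentiated propagation, not the TDR algorithm as published. It assumes trust flows along links from good seeds, distrust flows against links from bad seeds, and each page keeps only a fraction of incoming trust (or distrust) equal to its current trust (or distrust) share. The function name `tdr_sketch`, the damping factor `alpha`, the 0.5 default share, and the share formula are all hypothetical choices made for illustration.

```python
def tdr_sketch(out_links, good_seeds, bad_seeds, alpha=0.85, iters=20):
    """Illustrative two-score propagation (hypothetical, not the paper's formulas).

    out_links: dict mapping each page to the list of pages it links to.
    Returns (t, d): dicts of trust-like and distrust-like scores per page.
    """
    # Collect every page that appears as a source or a target.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    outs = {n: list(out_links.get(n, [])) for n in nodes}

    # Build reverse adjacency so distrust can flow against link direction.
    ins = {n: [] for n in nodes}
    for src in nodes:
        for dst in outs[src]:
            ins[dst].append(src)

    # Seed vectors: uniform mass over the good and bad seed sets.
    t_seed = {n: (1.0 / len(good_seeds) if n in good_seeds else 0.0) for n in nodes}
    d_seed = {n: (1.0 / len(bad_seeds) if n in bad_seeds else 0.0) for n in nodes}
    t, d = dict(t_seed), dict(d_seed)

    for _ in range(iters):
        new_t, new_d = {}, {}
        for p in nodes:
            # Trust offered to p by pages linking to p, split over their out-links.
            incoming_t = sum(t[q] / len(outs[q]) for q in ins[p])
            # Distrust offered to p by pages p links to, split over their in-links.
            incoming_d = sum(d[q] / len(ins[q]) for q in outs[p])
            # Assumed penalty: the target keeps trust in proportion to its
            # current trust share, and distrust in proportion to the remainder.
            total = t[p] + d[p]
            t_share = t[p] / total if total > 0 else 0.5
            new_t[p] = alpha * t_share * incoming_t + (1 - alpha) * t_seed[p]
            new_d[p] = alpha * (1 - t_share) * incoming_d + (1 - alpha) * d_seed[p]
        t, d = new_t, new_d
    return t, d


# Tiny example graph: "good" links to "a"; "b" links to "spam".
graph = {"good": ["a"], "a": [], "b": ["spam"], "spam": []}
t_rank, d_rank = tdr_sketch(graph, good_seeds={"good"}, bad_seeds={"spam"})
```

In this toy graph, `a` receives only trust (it is linked from a good seed) and `b` receives only distrust (it links to a bad seed), which mirrors the intuition in the abstract that a page's two scores should dampen each other's propagation.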
Similar resources
Propagating Trust and Distrust to Demote Web Spam
Web spamming describes behavior that attempts to deceive search engine’s ranking algorithms. TrustRank is a recent algorithm that can combat web spam by propagating trust among web pages. However, TrustRank propagates trust among web pages based on the number of outgoing links, which is also how PageRank propagates authority scores among Web pages. This type of propagation may be suited for pro...
A Novel Approach to Propagating Distrust
Trust propagation is a fundamental topic of study in the theory and practice of ranking and recommendation systems on networks. The PageRank [9] algorithm ranks web pages by propagating trust throughout a network, and similar algorithms have been designed for recommendation systems. How might one analogously propagate distrust as well? This is a question of practical importance and...
Link-Based Similarity Search to Fight Web Spam
We investigate the usability of similarity search in fighting Web spam based on the assumption that an unknown spam page is more similar to certain known spam pages than to honest pages. In order to be successful, search engine spam never appears in isolation: we observe link farms and alliances for the sole purpose of search engine ranking manipulation. The artificial nature and strong inside ...
A Survey on Web Spam Detection Methods: Taxonomy
Web spam refers to some techniques, which try to manipulate search engine ranking algorithms in order to raise web page position in search engine results. In the best case, spammers encourage viewers to visit their sites, and provide undeserved advertisement gains to the page owner. In the worst case, they use malicious contents in their pages and try to install malware on the victim’s machine....
Web Spam, Propaganda and Trust
Web spamming, the practice of introducing artificial text and links into web pages to affect the results of searches, has been recognized as a major problem for search engines. It is also a serious problem for users because they are not aware of it and they tend to confuse trusting the search engine with trusting the results of a search [16]. The parallels between web spamming on the internet a...